The necessity of adjusting tests of protein category enrichment in discovery proteomics
Abstract
MOTIVATION Enrichment tests are used in high-throughput experimentation to measure the association between gene or protein expression and membership in groups or pathways. Fisher's exact test is commonly used for this purpose. We examined the associations that the Fisher test produces between the identification of proteins by mass spectrometry-based discovery proteomics and the proteins' Gene Ontology (GO) term assignments in a large yeast dataset. We found that direct application of the Fisher test is misleading in proteomics, because mass spectrometry preferentially identifies proteins according to their biochemical properties. If this bias is not corrected, false inferences about associations can be made. Our method adjusts Fisher tests for these biases and produces associations that are more directly attributable to protein expression than to experimental bias.

RESULTS Using logistic regression, we modeled the association between protein identification and GO term assignments while adjusting for identification bias in mass spectrometry. The model accounts for five biochemical properties of peptides: (i) hydrophobicity, (ii) molecular weight, (iii) transfer energy, (iv) beta turn frequency and (v) isoelectric point. The model was fit on 181 060 peptides from 2678 proteins identified in 24 yeast proteomics datasets at a 1% false discovery rate. In analyzing the association between protein identification and GO term assignments, we found that 25% (134 out of 544) of the Fisher tests that showed a significant association (q-value ≤0.05) were no longer significant after adjustment with our model. Simulations that generated yeast protein sets enriched for identification propensity showed that unadjusted enrichment tests were biased, whereas our approach performed well.
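For orientation, the unadjusted test discussed in the abstract can be sketched as a one-sided Fisher's exact test computed from the hypergeometric tail. This is a minimal illustration, not the authors' code: the function name `fisher_enrichment_pvalue` and all counts except the 2678 identified proteins are hypothetical, and the logistic-regression adjustment for peptide properties is not reproduced here.

```python
from math import comb


def fisher_enrichment_pvalue(k, n_identified, n_in_term, n_total):
    """One-sided Fisher's exact test for over-representation.

    k            -- identified proteins that carry the GO term
    n_identified -- all identified proteins
    n_in_term    -- all proteins in the background that carry the GO term
    n_total      -- size of the background proteome

    Returns P(X >= k) where X is hypergeometric, i.e. the probability of
    seeing at least k annotated proteins among the identified ones by chance.
    """
    max_k = min(n_identified, n_in_term)
    denom = comb(n_total, n_identified)
    # Sum the hypergeometric probabilities from k up to the largest possible overlap.
    tail = sum(
        comb(n_in_term, i) * comb(n_total - n_in_term, n_identified - i)
        for i in range(k, max_k + 1)
    )
    return tail / denom


if __name__ == "__main__":
    # Hypothetical counts: only n_identified (2678) comes from the abstract.
    p = fisher_enrichment_pvalue(k=70, n_identified=2678, n_in_term=100, n_total=6000)
    print(f"unadjusted one-sided p-value: {p:.3g}")
```

A small p-value from this test only says that the GO term is over-represented among identified proteins. As the abstract argues, it cannot distinguish over-representation caused by expression from over-representation caused by properties that make some proteins easier to identify, which is why an adjusted model is needed.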
Journal: Bioinformatics
Volume 26, Issue 24
Pages: -
Publication date: 2010